Short-term wind speed prediction based on improved Hilbert–Huang transform method coupled with NAR dynamic neural network model

Wind energy is a clean, pollution-free renewable energy source, and its abundant resources have made wind power one of the fastest-growing and most widely adopted methods of electricity generation. Wind speed is a key characteristic in the study of wind energy resources, and this study focuses on predictive models for wind speed in wind power generation. The strong intermittency, randomness, and uncontrollability of wind speed raise development costs and challenge the stability of power systems. Scientifically forecasting wind speed variations is therefore essential for ensuring the safety of wind power equipment, maintaining the grid integration of wind power, and ensuring the secure and stable operation of power systems, and it offers significant guidance for power production scheduling institutions. This research aims to improve the accuracy and stability of wind speed prediction, thereby reducing the costs of wind power generation and promoting the sustainable development of renewable energy. The paper uses an improved Hilbert–Huang transform (HHT) based on complementary ensemble empirical mode decomposition (CEEMD) to overcome problems of the traditional empirical mode decomposition (EMD) method, such as mode mixing and white noise interference. This approach improves the efficiency of wind speed data processing and better accommodates the strongly stochastic and nonlinear characteristics of the data. Furthermore, mathematical analytical methods are used to compute a weight for each component, and a dynamic neural network model is constructed to optimize wind speed time series modeling, aiming for a more accurate prediction of wind speed fluctuations. Finally, the optimized HHT-NAR model is applied to wind speed forecasting in the Xinjiang region, where it significantly reduces the root mean square error and increases the coefficient of determination. The model shows theoretical innovation as well as superior performance in practical applications, providing an effective predictive tool for wind energy generation.

their prediction on improved grey relational analysis of wind speed spatiotemporal correlations within turbine clusters. This resulted in the division of typical turbine multi-order neighborhoods and the reconstruction of wind speed information matrices for prediction. Jiang et al. 24 assessed time series stationarity through time series charting and augmented Dickey-Fuller tests. They employed LSTM and GRU neural networks for wind speed prediction, yielding favorable predictive results. Wang et al. 25 clustered wind turbine units using density peak algorithms, followed by applying long short-term memory network models for wind speed prediction within similar turbine clusters. Experimental results demonstrated the high predictive accuracy of this method.
In existing research, wind speed prediction mainly involves decomposing the historical measurement data of wind farms to establish predictive learning models. The prediction results of the components are directly superimposed, which often overlooks the different impact that IMF components of different frequencies have on the prediction outcome. To address this, this paper proposes a weight optimization method based on a mathematical analytical model to determine the weight coefficient of each IMF component, and establishes an optimized HHT-NAR daily-level short-term wind speed prediction model. First, the complementary ensemble empirical mode decomposition (CEEMD) method is introduced to decompose the original wind speed sequence. Then, the Hilbert transform (HT) is used to conduct spectral analysis on the resulting intrinsic mode functions (IMFs), taking advantage of its ability to handle nonlinear and non-stationary signals. Different neural network models are established for prediction according to the spectral characteristics of each component. Finally, the mathematical analytical model is used to determine the weight coefficient of each component, and the final prediction is obtained as the weighted sum of the component predictions. The methodology is applied to two different wind-rich areas in Xinjiang, and comparative analysis with other models is conducted to validate the reliability of the proposed model. The results indicate that the optimized model significantly reduces the root mean square error and increases the coefficient of determination. The optimized model not only demonstrates theoretical innovation but also improves performance and generalizability. It provides an effective predictive tool for the sustainable utilization of regional wind energy resources, potentially addressing the complex challenges faced by wind power generation.

An improved Hilbert-Huang transform method
Wind speed data obtained in practice is often difficult to use directly for neural network prediction because it may contain outliers and random noise 26. Using such data without preprocessing often results in significant errors, so data denoising is needed to improve prediction accuracy. The Hilbert-Huang transform (HHT) is a processing method for nonlinear, non-stationary signals proposed by Huang et al. in 1998 27. It is easy to implement, intuitive, and efficient, and offers adaptivity, good time-frequency concentration, completeness, and reconstructability 28. HHT consists of two parts: empirical mode decomposition (EMD) and the Hilbert transform (HT). EMD is one of the most commonly used techniques for data denoising. It decomposes a complex time series into several intrinsic mode functions (IMFs) and a trend component, so that the Hilbert transform can be applied to each IMF. This yields the Hilbert spectrum of each component and ultimately allows the frequency-domain characteristics of wind speed sequences to be explored and exploited. The method assumes that any complex signal consists of simple IMFs and that the IMF components are independent of one another 29. The specific procedure for decomposing the original wind speed time series is as follows.
1. Identify all local maxima and minima of the signal $V(t)$ and fit the upper and lower envelopes of the original wind speed time series with cubic spline functions.
2. Calculate the mean $m_1(t)$ of the upper and lower envelopes and subtract it from $V(t)$ to obtain $h_1(t)$:

$$h_1(t) = V(t) - m_1(t)$$

3. If $h_1(t)$ satisfies the IMF condition, set $c_1(t) = h_1(t)$; then $c_1(t)$ is the first IMF component and the highest-frequency component of the original wind speed time series. Otherwise, treat $h_1(t)$ as the new $V(t)$ and repeat the above steps K times until the IMF condition is satisfied.
4. Separate $c_1(t)$ from the original signal to obtain the residual component:

$$r_1(t) = V(t) - c_1(t)$$

Using $r_1(t)$ as the new original data, the above steps are repeated to obtain n IMF components and 1 residual component:

$$r_2(t) = r_1(t) - c_2(t), \quad \ldots, \quad r_n(t) = r_{n-1}(t) - c_n(t)$$
When $r_n(t)$ satisfies the given termination condition, the decomposition ends, and the original wind speed time series can be expressed as:

$$V(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$

where $r_n(t)$ is the residual function, representing the average trend of the signal, and $c_i(t)$ are the components of the signal at different characteristic time scales, with the scale increasing successively from $c_1(t)$ to $c_n(t)$. The Cauchy-type convergence criterion proposed by Huang et al. is used as the sifting termination condition: the standard deviation coefficient (SD) of two consecutive sifting results, denoted here by $S_d$, serves as the criterion and is defined as follows.

$$S_d = \sum_{t=1}^{T} \frac{\left| h_{1(k-1)}(t) - h_{1k}(t) \right|^2}{h_{1(k-1)}^2(t)} \le \alpha$$

where $h_{1(k-1)}(t)$ and $h_{1k}(t)$ are the results of two consecutive sifting iterations; $\alpha$ is a pre-set, sufficiently small value, and sifting terminates when $S_d$ is less than or equal to $\alpha$; T is the number of points in the original wind speed time series.

For an arbitrary time series $X(t)$, the Hilbert transform $Y(t)$ is defined as the convolution of $X(t)$ with $1/(\pi t)$:

$$Y(t) = \frac{1}{\pi} P \int_{-\infty}^{+\infty} \frac{X(\tau)}{t - \tau} \, d\tau$$

where P is the Cauchy principal value, and the transform is well defined for all functions of class $L^p$; $\tau$ is the integration variable; t is the current moment.
According to the above definition, $X(t)$ and $Y(t)$ form a complex conjugate pair, giving the analytic signal $Z(t)$:

$$Z(t) = X(t) + iY(t) = a(t) e^{i\theta(t)}$$

where the instantaneous amplitude $a(t)$ and instantaneous phase $\theta(t)$ are:

$$a(t) = \sqrt{X^2(t) + Y^2(t)}, \qquad \theta(t) = \arctan \frac{Y(t)}{X(t)}$$

The instantaneous frequency is then:

$$\omega(t) = \frac{d\theta(t)}{dt}$$

However, when the components within a frequency band of the signal are discontinuous, or when there is intermittent noise interference, EMD may suffer from mode mixing, which destroys the physical meaning of each IMF and reduces the decomposition accuracy 30. To prevent different modes of the signal from being mixed during decomposition, complementary ensemble empirical mode decomposition (CEEMD) adds pairs of positive and negative white noise of equal magnitude to the source signal as auxiliary noise. This removes the residual auxiliary white noise left in the reconstructed signal and reduces the number of iterations required, lowering the computational cost 31. CEEMD thus effectively mitigates mode mixing. The specific decomposition steps are as follows.
1. A pair of zero-mean white noise series with opposite signs, $\pm\omega_t$, is randomly generated and added to the original time series $X_t$ to obtain two new series $M_1$ and $M_2$:

$$M_1 = X_t + \omega_t, \qquad M_2 = X_t - \omega_t$$

2. The EMD algorithm is used to decompose $M_1$ and $M_2$ separately, yielding two sets of IMF components and residual terms.
3. The above steps are repeated N times (N = 1, 2, …), and the IMF components of the CEEMD decomposition are obtained by averaging the 2N sets of components generated:

$$IMF_j = \frac{1}{2N} \sum_{i=1}^{2N} IMF_{ij}, \qquad R = \frac{1}{2N} \sum_{i=1}^{2N} R_i$$

where $IMF_j$ denotes the jth IMF component of the decomposed sequence, 1 ≤ j ≤ m; $IMF_{ij}$ denotes the jth IMF component of the ith sequence; and $R_i$ denotes the residual term of the ith sequence.
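The effect of the complementary noise pairs can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `toy_decompose` is a hypothetical stand-in (a moving-average split into a detail part and a trend part), not real EMD sifting, and the noise level, number of pairs, and test signal are arbitrary. Real EMD may also return a different number of IMFs for each noisy copy, which this sketch does not handle.

```python
import math
import random

def toy_decompose(series, window=5):
    """Hypothetical stand-in for EMD (NOT real sifting): split a series into
    a 'detail' part and a moving-average 'trend' part that sum to the input."""
    n = len(series)
    half = window // 2
    trend = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    detail = [x - m for x, m in zip(series, trend)]
    return [detail, trend]

def ceemd_average(series, decompose, n_pairs=10, noise_std=0.2, seed=0):
    """Average the components of n_pairs complementary (+noise, -noise)
    copies of the series, i.e. of 2 * n_pairs decompositions in total."""
    rng = random.Random(seed)
    n = len(series)
    sums = None
    for _ in range(n_pairs):
        noise = [rng.gauss(0.0, noise_std) for _ in range(n)]
        for sign in (1.0, -1.0):          # M1 = X + w, M2 = X - w
            noisy = [x + sign * w for x, w in zip(series, noise)]
            comps = decompose(noisy)
            if sums is None:
                sums = [[0.0] * n for _ in comps]
            for j, comp in enumerate(comps):
                for t in range(n):
                    sums[j][t] += comp[t]
    return [[v / (2 * n_pairs) for v in comp] for comp in sums]

# Arbitrary test signal: an oscillation plus a slow trend.
signal = [math.sin(0.3 * t) + 0.05 * t for t in range(60)]
components = ceemd_average(signal, toy_decompose)

# Because each noise series enters once with each sign, it cancels in the
# reconstruction obtained by summing the averaged components.
rebuilt = [sum(c[t] for c in components) for t in range(len(signal))]
max_err = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

With this stand-in decomposer the reconstruction error `max_err` is at the level of floating-point rounding, which mirrors the motivation for using paired noise: the auxiliary noise does not remain in the reconstructed signal.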

NAR dynamic neural network
The NAR dynamic neural network is based on the nonlinear autoregressive (NAR) model, which uses the series itself as the regressor and describes the random variable at a later moment in terms of the random variables at earlier moments 32. As a time series-based dynamic neural network, NAR offers more than a static mapping in its output. It makes comprehensive use of previous dynamic outcomes and can be converted to and from a feedforward network. Hence, it possesses feedback and memory capabilities and performs better than feedforward neural networks 33.
The NAR neural network model can be described as:

$$y(t) = f\big(y(t-1), y(t-2), \ldots, y(t-d)\big)$$

where $y(t)$ is the value at the current moment, $y(t-1), y(t-2), \ldots, y(t-d)$ are the values at historical moments, d is the delay order, and f is the nonlinear mapping learned by the network. NAR dynamic neural networks generally consist of an input layer, a time lag layer, a hidden layer, and an output layer. As shown in Fig. 1, the data $y(t)$ enters through the input layer and is processed, trained, and learned through the time lag and hidden layers; finally, the output layer outputs the prediction results. In Fig. 1, $y(t)$ is the input data, $y_0(t)$ is the output data, 1:10 is the delay order, W is the connection weight, and b is the threshold value.
The NAR dynamic neural network model differs from other network models in two main aspects. First, both the input and output values of the model are $y(t)$. Second, it includes input delays within the hidden layer. In essence, the NAR dynamic neural network is a static neural network with an input delay function, where the delay order determines the number of inputs to the network. Therefore, the prediction accuracy of the NAR dynamic neural network can be improved by tuning the delay order and the number of neurons in the hidden layer.
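The view of NAR as a static network fed by delayed copies of its own series can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB implementation used in the study: the helper names `make_lagged`, `nar_forward`, and `closed_loop_forecast` are hypothetical, the weights are random and untrained, and the toy series is arbitrary.

```python
import math
import random

def make_lagged(series, d):
    """Delay embedding: each input row is [y(t-1), ..., y(t-d)] and the
    matching target is y(t), so the delay order d sets the input size."""
    inputs, targets = [], []
    for t in range(d, len(series)):
        inputs.append([series[t - k] for k in range(1, d + 1)])
        targets.append(series[t])
    return inputs, targets

def nar_forward(x, W1, b1, W2, b2):
    """One hidden tanh layer followed by a single linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2

def closed_loop_forecast(history, d, steps, predict):
    """Multi-step forecast that feeds each prediction back as an input."""
    window = list(history[-d:])           # last d values, oldest first
    forecasts = []
    for _ in range(steps):
        x = window[::-1]                  # most recent first: y(t-1), ..., y(t-d)
        nxt = predict(x)
        forecasts.append(nxt)
        window = window[1:] + [nxt]
    return forecasts

series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # arbitrary toy series
d, n_hidden = 3, 4                        # illustrative settings only
X, y = make_lagged(series, d)

rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b2 = 0.0
prediction = nar_forward(X[0], W1, b1, W2, b2)   # untrained, structure only

# Feedback loop demonstrated with a trivial rule in place of a trained network.
forecast = closed_loop_forecast(series, d, steps=3, predict=lambda x: x[0] + 1.0)
```

Changing `d` changes the width of every input row, which is the sense in which the delay order determines the number of network inputs; changing `n_hidden` changes only the hidden layer.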
The general steps for establishing a dynamic neural network model such as NAR are outlined as follows:
Step 1: Apply the cross-validation method to compute the bias and scaling factors for the decomposed wind speed subsequence data, determining the model's order.
Step 2: Partition the wind speed subsequence data into training, validation, and testing sets with specified proportions. Simultaneously, use MATLAB's 'preparets' function for data transformation.
Step 3: Define the number of neurons in the hidden layer and the delay order.
Step 4: Train the neural network model using in-sample data and assess the fitting effect through error autocorrelation plots. Repeat the process if the fitting criteria are not met.
Step 5: Save the trained neural network and proceed with data prediction, observing prediction errors.
Through the above steps, NAR dynamic neural network models were constructed for the different wind speed subsequences, and running these models generated predictions for each subsequence. The conventional approach then sums the predicted values of all components and the trend term to obtain the predicted wind speed sequence. This overlooks the significant stochastic, uncertain, and nonlinear characteristics of wind speed data. This study instead introduces a mathematical analytical model to derive weights for the different wind speed subsequence components. These weights are used to aggregate the predictions, improving the accuracy of the model's predictions.
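The mathematical analytical model calculates the weight coefficient of each component by solving a specific mathematical model 34. The idea and solution process are as follows. The wind speed prediction obtained by weighted superposition is:

$$\hat{V}(t) = c_1 f_{1p}(t) + \cdots + c_n f_{np}(t) + c_{n+1} r_{np}(t) \qquad (10)$$

where $\hat{V}(t)$ denotes the weighted wind speed prediction; $c_1, \ldots, c_n, c_{n+1}$ are the weighting factors corresponding to each component; and $f_{1p}(t), \ldots, f_{np}(t), r_{np}(t)$ are the prediction results of each component.
Here, the error sum of squares P is minimized as the objective function, i.e.

$$\min P = \sum_{t=1}^{T} \left[ V(t) - \hat{V}(t) \right]^2 \qquad (11)$$

Substituting Eq. (10) into Eq. (11) yields:

$$P = \sum_{t=1}^{T} \left[ V(t) - \sum_{i=1}^{n} c_i f_{ip}(t) - c_{n+1} r_{np}(t) \right]^2$$

The optimal weighting factors are found by taking the partial derivative of P with respect to each weighting factor and requiring that it satisfy the following constraint.

$$\frac{\partial P}{\partial c_i} = 0, \qquad i = 1, 2, \ldots, n+1$$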
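Setting every partial derivative of P to zero gives a linear system (the normal equations of a least-squares problem) in the weighting factors. The following Python sketch solves that system for a small synthetic case. It is an assumption-laden illustration rather than the authors' code: the function names and the toy component predictions are hypothetical, and the paper does not state which solver was used.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def optimal_weights(component_preds, actual):
    """Weights c minimizing sum_t (actual[t] - sum_i c_i * pred_i[t])**2.
    dP/dc_i = 0 for all i leads to (F F^T) c = F v, solved here directly."""
    k = len(component_preds)
    A = [[sum(p * q for p, q in zip(component_preds[i], component_preds[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(p * v for p, v in zip(component_preds[i], actual))
         for i in range(k)]
    return solve_linear(A, b)

# Synthetic example: two component predictions and one trend prediction.
f1 = [1.0, 2.0, 3.0, 4.0]
f2 = [1.0, 0.0, 1.0, 0.0]
trend = [2.0, 2.0, 2.0, 2.0]
# "Actual" series built with known weights so the result can be checked.
actual = [0.5 * a + 1.5 * b + 1.0 * r for a, b, r in zip(f1, f2, trend)]

weights = optimal_weights([f1, f2, trend], actual)
```

In this constructed example the recovered weights match the ones used to build `actual` (0.5, 1.5, 1.0). Plain summation of component predictions corresponds to fixing every weight at 1.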

Optimizing the HHT-NAR model prediction process
The optimized HHT-NAR prediction process begins with pre-processing the wind speed time series to ensure data quality and integrity. Complementary ensemble empirical mode decomposition (CEEMD) is then used to decompose the wind speed time series efficiently into intrinsic mode functions (IMFs) of varying frequencies. Processed through the Hilbert-Huang transform (HHT), these IMF components reveal the nonlinear and non-stationary features embedded in the wind speed signal. Mathematical analytical methods are used to compute the weighting coefficient of each IMF, in order to quantify the impact of the CEEMD decomposition more accurately. Next, using these weighted IMFs, a nonlinear autoregressive (NAR) dynamic neural network model is constructed. Through training and optimization, the model is adapted to the historical variations in the wind speed data. The optimized model can then be applied in practical scenarios for short-term wind speed prediction. The detailed prediction process is depicted in Fig. 2.
Techniques such as CEEMD and HHT enable a more precise analysis of the nonlinear and non-stationary features of wind speed time series. They improve the adaptability of models to complex wind energy data and consequently the accuracy of short-term wind speed predictions. Computing the weighting coefficients of the IMFs with mathematical analytical methods quantifies the precision of the CEEMD decomposition and further strengthens the model's robustness. This allows the model to adapt better to varying wind speed scenarios, improving its stability and reliability across different wind speed fluctuations 35,36. Optimizing the HHT-NAR model workflow helps meet the stability requirements for input signals in power systems and addresses the challenges that the stochastic intermittency of natural wind poses to the power grid. This is important for the scheduling and control of wind farms and for the secure and stable operation of power systems.

Wind speed sequence experiment
The Alashankou wind area in Xinjiang is rich in wind resources: the annual average wind power density at the center of the wind area exceeds 200 W/m², and the effective wind speed hours exceed 5500 h. The area offers good wind resource conditions for large wind farms, and its wind speed shows no seasonal characteristics. Actual wind speed data for the region from 2021-6-15 to 2021-12-31, at a temporal resolution of 1 day, were selected for the case analysis. After data processing, the first 80% of the data were used as the training set and the last 20% as the test set.
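The chronological split described above can be sketched as follows. The sketch assumes exactly one observation per calendar day and that no days are dropped during data processing, neither of which is stated explicitly for this station; the variable names are illustrative.

```python
from datetime import date, timedelta

start, end = date(2021, 6, 15), date(2021, 12, 31)

# One time stamp per day, both end points included.
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Chronological split: first 80% for training, last 20% for testing.
# Integer arithmetic avoids floating-point rounding in the cut index.
n_train = len(days) * 8 // 10
train_days, test_days = days[:n_train], days[n_train:]

print(len(days), len(train_days), len(test_days))
```

Under these assumptions the period contains 200 daily values, giving 160 training points and 40 test points. The split is not shuffled, so every test day lies after every training day, as required for forecasting.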
To assess the performance of the constructed prediction model more accurately, three error evaluation metrics were employed: root mean square error (RMSE), coefficient of determination (R²), and mean absolute error (MAE). RMSE and MAE gauge prediction error, with smaller values indicating that predicted values are closer to actual values. An R² value closer to 1 indicates a better fit of the prediction model. The formulas for these metrics are as follows:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|$$

In the formulas, $y_i$ is the actual wind speed at time i, $\hat{y}_i$ is the predicted wind speed at time i, $\bar{y}$ is the mean of the actual values, and N is the total length of the wind speed sequence.
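For reference, the three metrics can be computed as in the following Python sketch, which uses the standard definitions of RMSE, MAE, and R². The sample values are made up for illustration and are not taken from the paper's data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up wind speeds in m/s, for illustration only.
actual = [3.0, 5.0, 7.0]
predicted = [3.0, 5.0, 9.0]

scores = (rmse(actual, predicted), mae(actual, predicted),
          r_squared(actual, predicted))
```

RMSE and MAE carry the unit of the data (m/s here), whereas R² is dimensionless.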
The raw wind speed data and its CEEMD decomposition are shown in Fig. 3; the decomposition yields 6 IMF components and 1 residual component. Fig. 3 shows that the temporal characteristic scale of the IMF components increases from IMF1 to IMF6, with frequency going from high to low. The Hilbert spectra and marginal spectra of the wind speed were then obtained by HHT, as shown in Fig. 4, along with the final waveform of each component and its frequency variation with time (Fig. 5).
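The amplitude and frequency curves of each component come from applying the Hilbert transform to that component. A minimal Python sketch of this step is given below. It is an illustration under assumptions: it builds the discrete analytic signal with a naive DFT rather than an optimized FFT, uses a pure cosine as a stand-in for an IMF, and reports frequency in cycles per sample (the angular frequency divided by 2π). The function names are hypothetical.

```python
import cmath
import math

def analytic_signal(x):
    """Discrete analytic signal z = x + i*H[x], via a naive O(n^2) DFT."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative ones.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    return [sum(gain[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

def amplitude_and_frequency(x, dt=1.0):
    """Instantaneous amplitude |z| and frequency (phase change per step)."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    # Phase increment between neighbouring samples, wrapped to (-pi, pi].
    frequency = [cmath.phase(z[t + 1] * z[t].conjugate()) / (2 * math.pi * dt)
                 for t in range(len(z) - 1)]
    return amplitude, frequency

# Stand-in for an IMF: a cosine with 4 cycles over 64 samples.
n = 64
imf = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
amp, freq = amplitude_and_frequency(imf)
```

For this stand-in the amplitude is constant at 1 and the frequency is constant at 4/64 cycles per sample; for an actual IMF both would vary with time.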
The Pearson correlation coefficient was used to judge the degree of correlation between the IMF components and the wind speed data as a basis for reconstructing the components; the results are shown in Table 1.
Table 1 shows that the components IMF1, IMF2, IMF3, IMF4, and the trend term have relatively high correlation coefficients, indicating a stronger association with the original data, whereas IMF5 and IMF6 correlate more weakly with the original data. The component reconstruction method with the smallest error and simplest operation is adopted: the highly correlated components are used as Input 1 to Input 5 of the NAR neural network, while the weakly correlated IMF5 and IMF6 are summed and reconstructed into a new component, Input 6. Because IMF5 and IMF6 do not differ significantly in order of magnitude, data normalization is not required. The number of hidden-layer neurons for each input is first chosen with reference to the literature 37: the result of the empirical formula is used as the initial value, which is then adjusted experimentally to find the most suitable value. After repeated experimental comparison and analysis, the hidden-layer neuron settings for the different components in this experiment are shown in Table 2.
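The correlation-based grouping can be sketched in Python as follows. This is a hedged illustration: the paper does not report a numerical cut-off for "high" correlation, so the `threshold` argument is an assumption, as are the function names and the toy components.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def build_inputs(components, original, threshold):
    """Keep components with |r| >= threshold as separate inputs and sum
    the remaining ones into a single merged input placed last."""
    kept = []
    merged = [0.0] * len(original)
    any_merged = False
    for comp in components:
        if abs(pearson(comp, original)) >= threshold:
            kept.append(comp)
        else:
            merged = [m + c for m, c in zip(merged, comp)]
            any_merged = True
    return kept + ([merged] if any_merged else [])

# Toy data: one strongly correlated component and two weakly correlated ones.
original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
comp_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
comp_b = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
comp_c = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]

inputs = build_inputs([comp_a, comp_b, comp_c], original, threshold=0.5)
```

In this toy case `comp_a` is kept as its own input and the other two are merged into one, analogous to summing IMF5 and IMF6 into Input 6.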
An optimal NAR dynamic neural network model is established for each waveform to improve the prediction accuracy. Fig. 6 shows that, after CEEMD decomposition, the wind speed time series becomes smoother and its volatility is significantly reduced. The prediction errors decrease progressively from Input 1 to Input 6, indicating that the training effect gradually improves.
The predicted values of Input 1 to Input 6 were summed to obtain the predicted values of the original wind speed sequence, as depicted in Fig. 7, illustrating their comparison with the actual values.
The computation of three error assessment metrics is shown in Table 3: the optimized model exhibits an RMSE of 18.9818 m/s, an MAE of 14.1063 m/s, and an R² value of 0.8827. Smaller values of RMSE and MAE indicate a closer proximity between predicted and actual values, signifying smaller prediction errors. Meanwhile, an R² value closer to 1 suggests a better fitting of the prediction model. Thus, the model established in this article is deemed suitable for forecasting actual wind speeds within the studied region.

Day-level wind speed prediction based on optimized HHT-NAR model
To better validate the effectiveness of the proposed model, the long short-term memory (LSTM) neural network, which adds self-loops to the RNN architecture and can capture long-range dependencies and nonlinear information 38-40, is selected for comparison. Single LSTM and NAR prediction models were developed for the raw wind speed data, and a comparison of the prediction results of the models is shown in Fig. 8.
As seen in Fig. 8, the single LSTM prediction is poor: it shows a certain time lag and often fails to track the series when the wind speed varies widely. The single NAR prediction is slightly better than LSTM and responds to abrupt changes in the wind speed data without lag, but it cannot predict individual points accurately. The HHT-based NAR dynamic neural network model improves the overall prediction and shows essentially no lag, although large errors remain at individual abrupt-change points; the optimized HHT-NAR model reduces these errors. The results showed that the optimized wind speed … 4.
From Table 4, it is evident that the optimized HHT-NAR model constructed in this paper performs best, with RMSE, R², and MAE values of 2.5375 m/s, 0.9215, and 1.8072 m/s, respectively. In comparison, the HHT-NAR dynamic neural network has RMSE, R², and MAE values of 5.9785 m/s, 0.8915, and 2.725 m/s, respectively. Relative to the HHT-NAR model, the optimized HHT-NAR model decreases RMSE and MAE by 57.56% and 33.68%, respectively, and increases R² by 3.37%; for example, the RMSE reduction is (5.9785 − 2.5375)/5.9785 ≈ 57.56%. This significant reduction in RMSE and MAE indicates that the predicted values are closer to the actual values, further improving prediction accuracy. Comparing the NAR dynamic neural network model to the LSTM model, there is a reduction of 71.9% and 27.6% in RMSE and MAE, respectively, with an increase of 7.5% in R². Furthermore, comparing the

Annual wind speed forecast at Karamay station
To further verify the generality and accuracy of the proposed model, prediction was not limited to the wind-energy-rich area: a short-term wind speed prediction test was also carried out for the wind energy sub-rich area of Karamay, Xinjiang, using wind speed data from the Karamay wind farm from 2021-6-15 to 2021-12-31. A total of 200 data points were selected, with the last 40 data points used as the test set. Figure 9 illustrates the comparison between the actual wind speed at the Karamay station and the predictions from various models.
From Table 5, it can be observed that the predictive errors of the three models in the wind energy sub-rich area are smaller than at the previous station, which is attributed to lower fluctuations in the wind speed sequences in this region. Notably, the optimized HHT-NAR dynamic neural network model proposed in this paper significantly outperforms the other comparative models, with a pronounced improvement in root mean square error (RMSE) over the HHT-NAR model. This validates the generality and accuracy of the proposed method. Furthermore, the optimized HHT-NAR dynamic neural network model shows a 5% to 47% improvement in the coefficient of determination (R²) compared to the other three models, indicating its superior predictive performance. From Figs. 8 and 9, it is evident that all four models generally capture the overall trend of the wind speed data over time. However, compared with the actual measurements, the standalone LSTM model inadequately captures the finer details of the data. Specifically, it exhibits significant prediction errors for days with higher wind speeds, attributed to the strong randomness of the wind speed sequence, which interferes with the model's predictive accuracy.
By comparison, the predictions of the NAR model are closer to the actual measurements for days with higher wind speeds and reproduce more detail on other days. This is due to the NAR model's better handling of gradient vanishing and explosion during long sequence training, resulting in more comprehensive performance.
Additionally, the predictive performance of the HHT-NAR model surpasses that of the NAR model. It aligns more closely with the actual measurements on days with higher wind speeds, indicating that decomposing the wind speed sequence improves the model's accuracy. The optimized HHT-NAR model exhibits the best predictive performance, forecasting accurately even on days with higher wind speeds. This improvement is attributed to CEEMD, which effectively isolates the different fluctuation characteristics within the wind speed sequence. Furthermore, CEEMD decomposes the added noise, reducing reconstruction errors. Using mathematical analysis to combine the predictions of the decomposed components with different weights allows the model to capture the changing characteristics of each component more effectively, significantly improving prediction accuracy. For future work, efforts can be directed towards refining the combined model to converge to optimal precision while maintaining higher predictive stability.
4. In further optimizing the HHT-NAR model, the paper acknowledges the need for more in-depth comparative studies. Future work will extend the comparison beyond the machine learning models mentioned in this study. Additionally, exploration of more complex algorithms and models, such as optimized combined prediction strategies for greater precision and stability, is proposed. This aims to further improve the accuracy and stability of wind speed prediction, offering a more reliable forecasting tool for wind energy generation and contributing more significantly to the sustainable development of renewable energy sources.

Figure 4. Wind speed HHT spectra and marginal spectra.

Figure 5. Waveform and amplitude of wind speed component.

Figure 6. Input prediction of each component.

Figure 7. Comparison between the optimized model's predicted values and the original wind speed sequence.

Figure 9. Comparison of actual wind speed at Karamay station and prediction of each model.

Conclusion
1. Facing the intermittence and volatility of wind speed time series, this study utilized the CEEMD algorithm to decompose the wind speed sequence. Combined with the HHT method, this approach unearthed the physical characteristics of wind speed, facilitating the construction of a NAR dynamic neural network model suitable for prediction purposes.
2. The application of the CEEMD algorithm effectively reduced the non-stationarity of the original sequence, while the HHT method adeptly uncovered the nonlinear and non-stationary characteristics of the wind speed signal, laying a robust foundation for constructing the NAR dynamic neural network model. Utilizing mathematical analysis to examine the weight coefficients of each component, we successfully quantified the impact of the insufficient precision in the CEEMD decomposition. It is recommended for future work to further optimize the HHT-NAR model, especially in the selection of mathematical analysis models, employing more refined methods to enhance the model's fitting capability.
3. The optimized HHT-NAR dynamic neural network model constructed in this study has achieved significant success in wind speed prediction at two sites in Xinjiang. The model demonstrated exceptional performance in fitting the majority of wind speed data transition points, reducing RMSE and MAE in both wind-rich and wind-limited areas, exhibiting excellent fitting accuracy.

Table 1. Results of Pearson correlation coefficient analysis.

Table 2. Number of neurons in the hidden layer corresponding to different components.

Table 3. The error assessment results for the optimized model.

Figure 8. Comparison of actual wind speed at Alashankou station with predictions from various models.

Table 4. The error assessment results for different prediction models.

Table 5. Error evaluation results of different prediction models.